A generalization of Hopcroft-Karp algorithm for 
semi-matchings and covers in bipartite graphs 



Jan Katrenic and Gabriel Semanisin 

Institute of Computer Science, 
P.J. Safarik University, Faculty of Science, 
Jesenna 5, 041 54 Kosice, Slovak Republic, 
jan.katrenic@upjs.sk, gabriel.semanisin@upjs.sk



Abstract. An $(f, g)$-semi-matching in a bipartite graph $G = (U \cup V, E)$
is a set of edges $M \subseteq E$ such that each vertex $u \in U$ is incident
with at most $f(u)$ edges of $M$, and each vertex $v \in V$ is incident with at
most $g(v)$ edges of $M$. In this paper we give an algorithm that, for a graph
with $n$ vertices and $m$ edges, $n \le m$, constructs a maximum
$(f, g)$-semi-matching in running time
$O(m \cdot \min(\sqrt{\sum_{u \in U} f(u)}, \sqrt{\sum_{v \in V} g(v)}))$.

Using the reduction of [5], our result on the maximum $(f, g)$-semi-matching
problem directly implies an algorithm for the optimal semi-matching
problem with running time $O(\sqrt{n} \cdot m \cdot \log n)$.

1 Introduction 

We consider finite undirected graphs without loops and multiple edges. In
general we use standard concepts and notation of graph theory. In particular,
$\deg(u)$ denotes the degree of a vertex $u$ in $G = (V, E)$. If $M \subseteq E$
then $\deg_M(u)$ denotes the number of edges of $M$ incident with $u$. If $f$ is
an integer-valued function defined for all vertices of $G$ and $X \subseteq V$
then $f(X)$ stands for the sum $\sum_{v \in X} f(v)$.

Let $G = (U \cup V, E)$ be a bipartite graph with $n = |U| + |V|$ vertices and
$m = |E|$ edges (throughout the paper we consider only the non-trivial case with
no isolated vertices, i.e. $n - 1 < m$). A semi-matching $M$ of $G$ is a set of
edges $M \subseteq E(G)$ such that each vertex of $U$ is incident with exactly
one edge of $M$.

Semi-matching is a natural generalization of the classical matching in
bipartite graphs. Although the name semi-matching was introduced only recently
in [7], semi-matchings appear in many problems and were studied as early as the
1970s [9], with applications in wireless sensor networks [1, 13-15, 17] and a
wide area of scheduling problems [3, 6, 10, 11, 18]. For the weighted case of
the problem we refer to [4, 6, 12, 19].

The problem of finding an optimal semi-matching (see [7]) is motivated by
the following off-line load balancing scenario: Given a set of tasks and a set
of machines, each of which can process a subset of the tasks. Each task requires
one unit of processing time and must be assigned to some machine that can
process it. The tasks have to be assigned in a manner that minimizes a given
optimization objective. One natural goal is to process all tasks with the
minimum total completion time. Another goal is to minimize the average
completion time, or total flow time, which is the sum of time units necessary
for completion of all jobs (including the units while a job is waiting in the
queue).

Let $M$ be a semi-matching. The cost of $M$, denoted by $cost(M)$, is defined
as follows:

$cost(M) = \sum_{v \in V} \frac{\deg_M(v) \cdot (\deg_M(v) + 1)}{2}$.

A semi-matching is optimal if its cost is the smallest among the costs of all
admissible semi-matchings. The problem of computing an optimal semi-matching
was first studied by Horn [9] and Bruno et al. [3], where an $O(n^3)$
algorithm was presented. The problem has received considerable attention in the
past few years. Harvey et al. [7] showed that by minimizing the cost of a
semi-matching one simultaneously minimizes the maximum number of tasks assigned
to a machine, the flow time and the variance of loads. The same authors also
provided a characterization of an optimal assignment based on cost-reducing
paths and an algorithm for finding an optimal semi-matching in time
$O(n \cdot m)$. It constructs an optimal semi-matching step by step, starting
with an empty semi-matching, and in each iteration finds an augmenting path
from a free $U$-vertex to a vertex in $V$ with the smallest possible degree.
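The load-balancing objective can be illustrated with a small sketch. It assumes the flow-time cost in which a machine that receives $d$ unit tasks contributes $1 + 2 + \dots + d = d(d+1)/2$; the function name and the task and machine labels are hypothetical and chosen only for this example.

```python
from collections import Counter

def semi_matching_cost(matching):
    """Flow-time cost of a semi-matching given as (task, machine) pairs."""
    load = Counter(machine for _task, machine in matching)
    # A machine running d unit tasks completes them at times 1, 2, ..., d.
    return sum(d * (d + 1) // 2 for d in load.values())

balanced = [("t1", "m1"), ("t2", "m1"), ("t3", "m2"), ("t4", "m2")]
skewed = [("t1", "m1"), ("t2", "m1"), ("t3", "m1"), ("t4", "m2")]
print(semi_matching_cost(balanced), semi_matching_cost(skewed))
```

Under this assumption the balanced assignment is cheaper than the skewed one, which matches the intuition that an optimal semi-matching spreads the load evenly.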

Semi-matchings were generalized to quasi-matchings by Bokal et al. [2]. They
consider an integer-valued function $g$ defined on the vertex set and require
that each vertex $v \in V$ is connected to at least $g(v)$ vertices of $U$.

An $(f, g)$-quasi-matching in a bipartite graph $G = (U \cup V, E)$ is a set of
edges $M \subseteq E$ such that each vertex $u \in U$ is incident with at most
$f(u)$ edges of $M$, and each vertex $v \in V$ is incident with at least $g(v)$
edges of $M$. The authors provided a property of the lexicographically minimum
$g$-quasi-matching and showed that the lexicographically minimum
1-quasi-matching equals an optimal semi-matching. Moreover, they also designed
an algorithm to compute an optimal (lexicographically minimum)
$g$-quasi-matching in running time $O(m \cdot g(V))$.

Similarly, [2] defines an $(f, g)$-semi-matching of $G = (U \cup V, E)$ as a
set of edges $M \subseteq E$ such that every element $u$ of $U$ has at most
$f(u)$ incident edges from $M$, and every element $v$ of $V$ has at most $g(v)$
incident edges from $M$. A maximum $(f, g)$-semi-matching is one with as many
edges as possible.

The complexity bound for computing an optimum semi-matching was further
improved by Fakcharoenphol et al. [4], who presented an
$O(\sqrt{n} \cdot m \cdot \log n)$ algorithm for the optimal semi-matching
problem. The algorithm uses a reduction to the min-cost flow problem and
exploits the structure of the graphs and cost functions for the elimination of
many negative cycles in a single iteration.

Recently, [5] presented a reduction from the optimum semi-matching problem to
the maximum $(f, g)$-semi-matching problem, which shows that an optimal
semi-matching of $G$ can be computed in time
$O((n + m + T_{BDSM}(n, m)) \cdot \log n)$, where $n = |U| + |V|$, $m = |E|$,
and $T_{BDSM}(n, m)$ is the time complexity of an algorithm for computing a
maximum $(f, 1)$-semi-matching with $f(U) \le 2n$. By a result of [16], the
algorithm designed in [5] yields a randomized algorithm for optimal
semi-matching with a running time of $O(n^{\omega})$, where $\omega$ is the
exponent of the best known matrix multiplication algorithm. Since
$\omega < 2.38$, this algorithm broke through the $O(n^{2.5})$ barrier for
computing an optimal semi-matching in dense graphs [5].

In this paper we present an algorithm for finding a maximum
$(f, g)$-semi-matching in running time
$O(m \cdot \min\{\sqrt{f(U)}, \sqrt{g(V)}\})$. For the problem of computing an
$(f, g)$-quasi-matching it gives an algorithm with running time
$O(m \sqrt{g(V)})$. For the maximum $(f, 1)$-semi-matching we get a complexity
upper bound of $O(\sqrt{n} \cdot m)$, which implies a bound of
$O(\sqrt{n} \cdot m \cdot \log n)$ for computing an optimal semi-matching by
the algorithm presented in [5].

2 Augmenting paths and $(f, g)$-semi-matchings

In this section we introduce concepts that will be used throughout the
remaining part of the paper.

Definition 1. Let $f : U \to \mathbb{N}$ and $g : V \to \mathbb{N}$ be mappings.
An $(f, g)$-semi-matching in a bipartite graph $G = (U \cup V, E)$ is a set of
edges $M \subseteq E$ such that $\deg_M(u) \le f(u)$ for each vertex $u \in U$,
and $\deg_M(v) \le g(v)$ for each vertex $v \in V$.
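The two degree conditions of Definition 1 can be checked directly. The following is a minimal sketch, assuming edges are stored as (u, v) pairs with u in U and v in V and the capacities as dictionaries; the function name and the sample data are hypothetical.

```python
from collections import Counter

def is_semi_matching(edges, matching, f, g):
    """Check Definition 1 for a set of (u, v) pairs with u in U and v in V."""
    matching = set(matching)
    if not matching <= set(edges):      # M must be a subset of E
        return False
    deg_u = Counter(u for u, _v in matching)
    deg_v = Counter(v for _u, v in matching)
    return (all(deg_u[u] <= f[u] for u in deg_u)
            and all(deg_v[v] <= g[v] for v in deg_v))

edges = [("a", "x"), ("a", "y"), ("b", "x")]
f = {"a": 2, "b": 1}
g = {"x": 1, "y": 1}
print(is_semi_matching(edges, {("a", "x"), ("a", "y")}, f, g))
print(is_semi_matching(edges, {("a", "x"), ("b", "x")}, f, g))
```

The first set respects both capacity functions, while the second places two edges on the vertex x, whose capacity is 1.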

Definition 2. An $(f, g)$-semi-matching $M$ of a graph $G = (U \cup V, E)$ is
called maximum if $|M| \ge |M'|$ holds for each $(f, g)$-semi-matching $M'$ of
$G$. An $(f, g)$-semi-matching $M$ is called perfect if $|M| = f(U)$.

Note that a $(1, 1)$-semi-matching is a matching in a bipartite graph.

Definition 3. Let $G = (U \cup V, E)$ be a bipartite graph and $H \subseteq E$.
A path $P$ is called an $H$-alternating path if each internal vertex of $P$ is
incident with exactly one edge of $H \cap E(P)$.

Definition 4. Let $G = (U \cup V, E)$ be a bipartite graph and $H \subseteq E$.
An $H$-augmenting path $P$ is an $H$-alternating path whose first and last
vertices are not incident with an edge of $H \cap E(P)$.

Definition 5. Let $G = (U \cup V, E)$ be a bipartite graph, $H \subseteq E$,
$P$ be an $H$-alternating path and $E(P)$ be the edge set of $P$. We define the
operator $\oplus$ as follows:

$H \oplus P = (H \cup E(P)) \setminus (E(P) \cap H)$.
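The operator of Definition 5 is the symmetric difference of $H$ with the edge set of $P$. A minimal sketch, assuming undirected edges are represented as frozensets and a path as a vertex sequence (the function name and the labels are hypothetical):

```python
def augment(h, path):
    """Return H (+) P for an edge set h and a path given as a vertex list."""
    path_edges = {frozenset(e) for e in zip(path, path[1:])}
    return (h | path_edges) - (path_edges & h)

# H contains the middle edge of the path v1 - u1 - v2 - u2.
h = {frozenset(("u1", "v2"))}
result = augment(h, ["v1", "u1", "v2", "u2"])
print(sorted(sorted(e) for e in result))
```

Here the path is $H$-augmenting, so the result has one edge more than $H$: the middle edge is dropped and the two outer edges are added.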

The next theorem provides a characterisation of a maximum $(f, g)$-semi-matching.

Theorem 1. Let $M$ and $M'$ be $(f, g)$-semi-matchings of a graph $G$ with
$|M'| > |M|$. Then there exists an $M$-augmenting path $P$ with endvertices
$u \in U$, $v \in V$, $\deg_M(u) < f(u)$ and $\deg_M(v) < g(v)$ such that
$E(P) \subseteq M \cup M'$.

Proof. We proceed by induction on the size of $|M|$. Evidently, the assertion of
the theorem is true for the smallest cases. Now, we may assume that
$M \cap M' = \emptyset$, otherwise the assertion follows from the induction
hypothesis. Let us put

$A = \{v \in V : \deg_M(v) < \deg_{M'}(v)\}$.

Let $V_A$ be the set of vertices of $V$ for which there exists an
$M$-alternating path starting in a vertex of $A$ with an edge of $M'$. Here a
path of length 0 is considered to be an $M$-alternating path, therefore
$A \subseteq V_A$.

Let $U_A$ be the set of vertices of $U$ for which there exists an
$M$-alternating path starting in a vertex of $A$ with an edge of $M'$.

Let us put $V_B = V \setminus V_A$ and $U_B = U \setminus U_A$. For sets
$X \subseteq U$ and $Y \subseteq V$ we introduce parameters
$m(X, Y) = |E(G[X \cup Y]) \cap M|$ and $m'(X, Y) = |E(G[X \cup Y]) \cap M'|$.

From the definition of $V_B$ we get $m(U_A, V_B) = 0$ and the definition of
$U_A$ yields $m'(U_B, V_A) = 0$ (otherwise the existence of such an edge implies
the existence of an $M$-alternating path starting at a vertex of $A$ by an edge
of $M'$). This is depicted in Figure 1.

Fig. 1. The vertices of $G$ are divided into 4 parts. The edges between $U_B$
and $V_A$ cannot belong to $M'$, the edges between $U_A$ and $V_B$ cannot
belong to $M$.



Since $|M| < |M'|$, we have $m(U, V) < m'(U, V)$. Moreover $m(U_A, V_B) = 0$ and
$m'(U_B, V_A) = 0$, which gives

$m(U_A, V_A) + m(U_B, V_A) + m(U_B, V_B) < m'(U_A, V_A) + m'(U_A, V_B) + m'(U_B, V_B)$. (1)

Since $A \cap V_B = \emptyset$ and $m(U_A, V_B) = 0$, we get the inequality

$m(U_B, V_B) \ge m'(U_A, V_B) + m'(U_B, V_B)$. (2)

By (1) and (2) we get

$m(U_A, V_A) + m(U_B, V_A) < m'(U_A, V_A)$. (3)

Trivially, we have the following

$m(U_B, V_A) \ge -m'(U_A, V_B)$. (4)

Combining (3) and (4) we obtain

$m(U_A, V_A) < m'(U_A, V_A) + m'(U_A, V_B)$. (5)

From the inequality (5) we can conclude that $U_A$ contains a vertex $u$ with
$\deg_M(u) < \deg_{M'}(u)$. By the definition of $U_A$, this implies the
existence of an $M$-augmenting path with endvertex $u$ and an endvertex from
$A$.

Theorem 2. An $(f, g)$-semi-matching $M$ of a graph $G = (U \cup V, E)$ is
maximum if and only if there exists no $M$-augmenting path $P$ with endvertices
$u \in U$, $v \in V$, $\deg_M(u) < f(u)$ and $\deg_M(v) < g(v)$.

Proof. Suppose to the contrary that there is a maximum $(f, g)$-semi-matching
$M$ and an $M$-augmenting path $P$ with endvertices $u \in U$, $v \in V$ and
$\deg_M(u) < f(u)$, $\deg_M(v) < g(v)$. Then obviously $M \oplus P$ is an
$(f, g)$-semi-matching with $|M \oplus P| > |M|$, a contradiction.

The opposite direction comes from Theorem 1. 

The next theorem provides more information about the structure of aug- 
menting paths. 

Theorem 3. Let $M$ and $M'$ be $(f, g)$-semi-matchings of a bipartite graph $G$
such that $|M'| - |M| = k > 0$. Then there exist $k$ edge-disjoint
$M$-augmenting paths $P_1, P_2, \dots, P_k$ such that
$M \oplus P_1 \oplus \cdots \oplus P_k = M'$.

Proof. We prove the theorem by induction on the size of the graph $G$. The
assertion obviously holds for the smallest possible cases. If
$M \cap M' \ne \emptyset$, then $G \setminus (M \cap M')$ with
$M \setminus M'$, $M' \setminus M$ is an instance of the theorem of smaller
size and the claim follows from the induction hypothesis.

Suppose now $M \cap M' = \emptyset$. Using Theorem 1, there exists an
$M$-augmenting path $P$ such that its edges alternately belong to $M'$ and $M$.
Therefore $|M' \setminus E(P)| - |M \setminus E(P)| = k - 1$ and
$(M \oplus P) \cap E(P) = M' \cap E(P)$. Consider now the graph
$G \setminus E(P)$ and the edge sets $M \setminus E(P)$, $M' \setminus E(P)$.
From the induction hypothesis there exist $k - 1$ edge-disjoint paths
$P_1, \dots, P_{k-1}$ such that
$(M \setminus E(P)) \oplus P_1 \oplus \cdots \oplus P_{k-1} = M' \setminus E(P)$.
Clearly, $P$ is edge-disjoint with $P_1, \dots, P_{k-1}$ and

$M' = (M' \cap E(P)) \cup (M' \setminus E(P))$
$= ((M \oplus P) \cap E(P)) \cup ((M \setminus E(P)) \oplus P_1 \oplus \cdots \oplus P_{k-1})$
$= M \oplus P_1 \oplus \cdots \oplus P_{k-1} \oplus P$.

Corollary 1. Let $M$ and $M'$ be $(f, 1)$-semi-matchings of a bipartite
graph $G$ such that $|M'| - |M| = k > 0$. Then there exist $k$ $M$-augmenting
paths $P_1, P_2, \dots, P_k$ such that
$M' = M \oplus P_1 \oplus \cdots \oplus P_k$ and
$E(P_i) \cap E(P_j) = \emptyset$ for each $i, j \in \{1, 2, \dots, k\}$,
$i \ne j$.

Proof. It follows from Theorem 3 and the fact that $\deg_M(v) \le 1$ for
$v \in V$ that no two of those $M$-augmenting paths may overlap in a vertex
$v \in V$.

Let $M$ be an $(f, g)$-semi-matching of a bipartite graph $G = (U \cup V, E)$.
Denote by $V_M = \{v \in V : \deg_M(v) < g(v)\}$ the set of vertices of $V$
with free capacity. We set $adist_M(x)$ to be the length of a shortest
$M$-alternating path starting in any vertex of $V_M$ and ending in $x$. If no
such $M$-alternating path exists, we put $adist_M(x) = +\infty$.
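The distances $adist_M$ can be computed by a breadth-first search. The sketch below assumes that alternating paths leave a vertex of $V$ along an edge outside $M$ and leave a vertex of $U$ along an edge of $M$, that the labels of $U$ and $V$ are disjoint, and that every neighbour listed in adj_u is a key of g; all names are hypothetical.

```python
from collections import deque

def alternating_distances(adj_u, matching, g):
    """Return adist_M as a dict; vertices that are missing have distance +inf.

    adj_u maps each u in U to its neighbours in V, matching is a set of
    (u, v) pairs and g maps each v in V to its capacity.
    """
    deg_v = {v: 0 for v in g}
    for _u, v in matching:
        deg_v[v] += 1
    adj_v = {v: [] for v in g}
    for u, nbrs in adj_u.items():
        for v in nbrs:
            adj_v[v].append(u)
    # Distance 0: vertices of V that still have free capacity.
    dist = {v: 0 for v in g if deg_v[v] < g[v]}
    queue = deque(dist)
    while queue:
        x = queue.popleft()
        if x in adj_v:  # vertex of V: continue along edges outside M
            nxt = [u for u in adj_v[x] if (u, x) not in matching]
        else:           # vertex of U: continue along edges of M
            nxt = [v for v in adj_u[x] if (x, v) in matching]
        for y in nxt:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist

adj_u = {"u1": ["v1", "v2"], "u2": ["v1"]}
print(alternating_distances(adj_u, {("u1", "v1")}, {"v1": 1, "v2": 1}))
```

In this example only v2 has free capacity, so the search reaches u1, then v1 through the edge of $M$, and finally u2.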

Theorem 4. Let $M$ be an $(f, g)$-semi-matching of a bipartite graph
$G = (U \cup V, E)$ and $P$ be a shortest $M$-augmenting path. Then
$adist_M(x) \le adist_{M \oplus P}(x)$ for each vertex $x \in U \cup V$.

Proof. Assume to the contrary that there exists at least one vertex $x$ such
that $adist_M(x) > adist_{M \oplus P}(x)$. Let us choose such a vertex $x$ with
the smallest possible value of $adist_M(x)$. It means that for each vertex $y$
with $adist_M(y) < adist_M(x)$ the inequality
$adist_M(y) \le adist_{M \oplus P}(y)$ is valid.

Clearly $adist_{M \oplus P}(x)$ cannot be 0, because in such a case $x$ is a
vertex of $V$ for which $\deg_{M \oplus P}(x) < g(x)$ and that is why
$adist_M(x)$ must be zero as well.

Thus, $adist_{M \oplus P}(x)$ is at least 1. Let $y$ be the predecessor of $x$
in a shortest $(M \oplus P)$-alternating path starting in a vertex of
$V_{M \oplus P}$. Obviously $adist_{M \oplus P}(y) + 1 = adist_{M \oplus P}(x)$.
It also holds that $adist_M(y) \le adist_{M \oplus P}(y)$ (otherwise $x$ was not
chosen correctly), which together with the previous equation gives
$adist_M(y) < adist_{M \oplus P}(x)$. Together with the initial inequality for
$x$ we obtain $adist_M(y) < adist_{M \oplus P}(x) < adist_M(x)$. This implies
that the edge $xy$ was changed, i.e. $xy \in P$ (otherwise the edge $xy$ could
be used to violate the inequality $adist_M(x) > adist_{M \oplus P}(x)$). Let us
now distinguish two cases:

Case 1. $x \in U$ and $y \in V$. As $y$ is the predecessor of $x$ in an
$(M \oplus P)$-alternating path starting at $V_{M \oplus P}$, it implies that
the edge $yx \notin M \oplus P$ and $yx \in M$. Now let us consider the path
$P$. The path $P$ was the shortest $M$-alternating path starting at $V_M$.
Since $adist_M(y) < adist_M(x)$ and $xy \in P$, the path $P$ must visit the
vertex $y$ before $x$. However, in such a case, by the definition of an
alternating path starting at $V$, the edge going from $V$ to $U$ must be
unmatched, a contradiction.

Case 2. $x \in V$ and $y \in U$. As $y$ is the predecessor of $x$ in an
$(M \oplus P)$-alternating path starting at $V_{M \oplus P}$, it implies that
$yx \in M \oplus P$, consequently $yx \notin M$. The path $P$ was the shortest
$M$-alternating path starting at $V_M$. Since $adist_M(y) < adist_M(x)$ and
$xy \in P$, the path $P$ must first visit the vertex $y$ and then $x$. However,
in such a case, by the definition of an alternating path starting at $V$, the
edge going from $U$ to $V$ must be matched, a contradiction.

3 The algorithm for finding a maximum $(f, g)$-semi-matching

In this section we describe an algorithm for solving the following problem: 

Problem 1. Given a bipartite graph $G = (U \cup V, E)$ and two mappings
$f : U \to \mathbb{N}$ and $g : V \to \mathbb{N}$, find a maximum
$(f, g)$-semi-matching of $G$.

In order to simplify the notation, for an $(f, g)$-semi-matching $M$ of a
bipartite graph $G = (U \cup V, E)$ and for each vertex $u \in U \cup V$ we
introduce the parameter $c_M(u)$ as follows:

$c_M(u) = \begin{cases} f(u) - \deg_M(u) & \text{if } u \in U, \\ g(u) - \deg_M(u) & \text{if } u \in V. \end{cases}$

We call an $M$-augmenting path with endvertices $u \in U$, $v \in V$ such that
$c_M(u) > 0$ and $c_M(v) > 0$ an $M_{f,g}$-augmenting path.

Our algorithm applies the same scheme as the well-known algorithm of
Hopcroft and Karp [8]. We start with an empty $(f, g)$-semi-matching $M$ and in
each iteration we extend $M$ by several augmenting paths. The length of a
shortest $M_{f,g}$-augmenting path increases after each iteration and each
iteration of the algorithm consumes $O(m)$ time.

Fig. 2. The vertices of $G$ classified into layers



One iteration of the algorithm finds the smallest number $t$ for which an
$M_{f,g}$-augmenting path of length $t$ exists. Next, the algorithm extends $M$
by several augmenting paths in a single iteration, as long as there is an
augmenting path of length $t$. More precisely:

1. Let $L_0 = \{v \in V : c_M(v) > 0\}$.

2. In terms of the Breadth-First Search algorithm, classify the vertices of $G$
into layers $L_1, L_2, \dots, L_n$ such that
$L_i = \{v \in U \cup V : adist_M(v) = i\}$. This can be implemented as follows:

For each $i = 0, 2, 4, \dots, 2\lfloor n/2 \rfloor$ do

$L_{i+1} = \{u \in U : u \notin L_0 \cup \dots \cup L_i \wedge \exists v \in L_i : uv \notin M\}$

$L_{i+2} = \{v \in V : v \notin L_0 \cup \dots \cup L_{i+1} \wedge \exists u \in L_{i+1} : uv \in M\}$

3. Let $t > 0$ be the smallest odd number such that there exists $u \in L_t$
with $c_M(u) > 0$. If no such $t$ exists, by Theorem 2 there is no
$M_{f,g}$-augmenting path; the algorithm stops and $M$ is a maximum
$(f, g)$-semi-matching. Otherwise it continues with step 4.

4. For each vertex $u \in L_t$, while $c_M(u) > 0$ do:

(i) Find an arbitrary $M_{f,g}$-augmenting path $P$ of length $t$ starting in
$u$ such that $V(P) \subseteq L_0 \cup L_1 \cup \dots \cup L_t$.

(ii) If such a path $P$ exists, set $M := M \oplus P$ and recalculate the
values of $c_M$ along the path $P$.
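Steps 1 to 4 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the labels of $U$ and $V$ are assumed to be disjoint, every neighbour listed in adj_u is assumed to be a key of g, the path search uses recursion with a per-vertex pointer to skip dead ends, and no attempt is made to match the $O(m)$ bound per iteration exactly. All function and variable names are hypothetical.

```python
from collections import deque

def max_semi_matching(adj_u, f, g):
    """Return a maximum (f, g)-semi-matching as a set of (u, v) pairs.

    adj_u maps every u in U to its neighbours in V; f and g are the
    capacity dicts. Labels of U and V are assumed to be disjoint.
    """
    adj_v = {v: [] for v in g}
    for u, nbrs in adj_u.items():
        for v in nbrs:
            adj_v[v].append(u)
    matching = set()
    cap = {**f, **g}  # residual capacities c_M, valid for the empty M

    while True:
        # Steps 1-2: BFS layers, starting from V-vertices with c_M > 0.
        level = {v: 0 for v in g if cap[v] > 0}
        queue = deque(level)
        while queue:
            x = queue.popleft()
            if x in adj_v:  # V-vertex: continue along edges outside M
                nxt = [u for u in adj_v[x] if (u, x) not in matching]
            else:           # U-vertex: continue along edges of M
                nxt = [v for v in adj_u[x] if (x, v) in matching]
            for y in nxt:
                if y not in level:
                    level[y] = level[x] + 1
                    queue.append(y)

        # Step 3: nearest layer containing a U-vertex with c_M > 0.
        free_u = [u for u in adj_u if u in level and cap[u] > 0]
        if not free_u:
            return matching  # no augmenting path is left
        t = min(level[u] for u in free_u)
        ptr = {x: 0 for x in level}  # next untried neighbour of each vertex

        def descend(x):
            """Follow the layers from x down to layer 0; return a path or None."""
            if level[x] == 0:
                return [x] if cap[x] > 0 else None
            in_v = x in adj_v
            nbrs = adj_v[x] if in_v else adj_u[x]
            while ptr[x] < len(nbrs):
                y = nbrs[ptr[x]]
                edge = (y, x) if in_v else (x, y)
                # V-vertices step down along M, U-vertices along edges outside M.
                if (edge in matching) == in_v and level.get(y) == level[x] - 1:
                    tail = descend(y)
                    if tail is not None:
                        return [x] + tail
                ptr[x] += 1
            return None

        # Step 4: augment along paths of length t while they exist.
        for u in free_u:
            while level[u] == t and cap[u] > 0:
                path = descend(u)
                if path is None:
                    break
                for a, b in zip(path, path[1:]):
                    edge = (a, b) if a in adj_u else (b, a)
                    matching ^= {edge}  # M := M (+) P
                cap[path[0]] -= 1
                cap[path[-1]] -= 1
```

For instance, with adj_u = {"a": ["x", "y"], "b": ["x"]} and all capacities equal to 1, the first iteration matches a with x, and the second iteration augments along the path b, x, a, y of length 3, which yields the edges (a, y) and (b, x).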

Theorem 5. The length of the shortest augmenting path increases after each 
iteration of the algorithm. 

Proof. An iteration which processes an $(f, g)$-semi-matching $M$ stops when
there is no $M_{f,g}$-augmenting path consisting of vertices of
$L_0 \cup L_1 \cup \dots \cup L_t$. It remains to prove that after such an
iteration there is no augmenting path of length $t$ in the graph $G$ (a path of
length less than $t$ cannot appear due to Theorem 4 and the fact that all
vertices in layers $L_1, L_2, \dots, L_{t-1}$ have zero capacity).

Suppose to the contrary that after the iteration there is an
$M'_{f,g}$-augmenting path $P = \{v_0, v_1, \dots, v_t\}$ of length $t$ in $G$,
where $M'$ denotes the semi-matching after the iteration. Since all the
vertices of $V_{M'}$ are located in $L_0$, we have $v_0 \in L_0$. Since $P$ is
an alternating path starting in a vertex of $L_0$, we have
$adist_{M'}(v_i) \le i$ for each $i = 0, 1, \dots, t$. According to Theorem 4,
the value of $adist$ cannot decrease after an iteration, i.e.
$adist_M(v_i) \le adist_{M'}(v_i)$ for each $i = 0, 1, \dots, t$. Hence, each
vertex of $P$ appears in $L_0 \cup L_1 \cup \dots \cup L_t$ and such an
augmenting path was not processed during the iteration of the algorithm, which
is a contradiction.

3.1 The running time 

Let $n$ be the number of vertices in a given graph $G$ and $m$ be the number of
its edges. We assume that $m > n - 1$, since isolated vertices can be erased
from the graph in linear time.

The algorithm starts with an empty $(f, g)$-semi-matching $M$ and then iterates
as long as at least one augmenting path is found. In each loop, the algorithm
classifies the vertices into layers $L_0, L_1, \dots, L_t$ and modifies $M$ by
augmenting paths using vertices of $L_0, L_1, \dots, L_t$. This step consumes
$O(m)$ time, since each edge is manipulated at most once during one iteration.
The algorithm stops as soon as a loop finds no augmenting path.

The key part of the complexity analysis is to bound the number of loops of the
algorithm. Let $s$ be the size of a maximum $(f, g)$-semi-matching $M^*$. After
performing $\sqrt{s}$ iterations of the algorithm, according to Theorem 5, the
shortest $M$-augmenting path consists of at least $\sqrt{s}$ vertices.
According to Theorem 3 there exist $s - |M|$ edge-disjoint $M$-augmenting paths
that can simultaneously extend $M$ to size $s$, and those paths consist only of
edges of $M \cup M^*$. As each such path must be of length at least $\sqrt{s}$
and $|M \cup M^*|$ is at most $2s$, this implies that $s - |M| \le 2\sqrt{s}$.
Since in each loop the algorithm finds at least one augmenting path, the
algorithm surely stops after at most $2\sqrt{s}$ further loops. Hence, the
total number of performed loops is $O(\sqrt{s})$ and the algorithm runs in time
$O(m \cdot \sqrt{s})$.
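The counting step behind the bound on $s - |M|$ can be written out explicitly. The derivation below only restates the argument of the preceding paragraph; the names $P_1, \dots, P_{s-|M|}$ for the edge-disjoint augmenting paths are introduced here for illustration.

```latex
\begin{align*}
(s - |M|) \cdot \sqrt{s}
  \;&\le\; \sum_{i=1}^{s-|M|} |E(P_i)|
  \;\le\; |M \cup M^*|
  \;\le\; |M| + |M^*|
  \;\le\; 2s, \\
\text{hence} \quad s - |M| \;&\le\; \frac{2s}{\sqrt{s}} \;=\; 2\sqrt{s}.
\end{align*}
```

The first inequality uses that every path has length at least $\sqrt{s}$, the second that the paths are edge-disjoint and use only edges of $M \cup M^*$.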

Moreover, $s \le f(U)$ and $s \le g(V)$, so the algorithm computes a maximum
semi-matching in running time $O(m \cdot \min\{\sqrt{f(U)}, \sqrt{g(V)}\})$.
For the case of an $(f, 1)$-semi-matching this gives the complexity upper bound
$O(\sqrt{n} \cdot m)$.

To find an arbitrary $(f, g)$-quasi-matching one can use the algorithm for the
maximum $(f, g)$-semi-matching problem, which computes a maximum
$(f, g)$-semi-matching $M$. Clearly, if $|M| < g(V)$ then no
$(f, g)$-quasi-matching exists, otherwise $M$ is an $(f, g)$-quasi-matching.
Moreover, for an $(f, g)$-quasi-matching we may assume $f(U) \ge g(V)$
(otherwise no $(f, g)$-quasi-matching exists), so we get an algorithm with
running time $O(m \sqrt{g(V)})$.